This repository has been archived by the owner on Jul 1, 2023. It is now read-only.

Accurate Tensor.device for TFEager backends #1077

Closed
wants to merge 12 commits into from

Conversation

texasmichelle (Member) commented Sep 11, 2020

This modifies Tensor.device for TFEager backends to retrieve device details from the underlying eager libraries. This is more accurate than returning Device.defaultTFEager and reflects the actual location of eager operations.

This was tested by building a CUDA toolchain and verifying that TFE_TensorHandleDeviceName() returns:

/job:localhost/replica:0/task:0/device:GPU:0

and a GPU device is returned by Tensor.device:

Device(kind: .GPU, ordinal: 0, backend: .TF_EAGER)

On a device without GPU, TFE_TensorHandleDeviceName() returns:

/job:localhost/replica:0/task:0/device:CPU:0

and a CPU device is returned by Tensor.device:

Device(kind: .CPU, ordinal: 0, backend: .TF_EAGER)

Fixes tensorflow/swift#524.

texasmichelle changed the title from "Eager device" to "Accurate Device.defaultTFEager" on Sep 11, 2020
Sources/x10/swift_bindings/Device.swift (outdated)
// Parse type and ordinal from a string with the expected syntax:
// /job:localhost/replica:0/task:0/device:CPU:0
let pattern = ".+device:(.+):(\\d+)$"
let regex = try! NSRegularExpression(pattern: pattern)
Contributor

This string parsing looks expensive. Might want to double check that this doesn't happen too much.
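One way to reduce part of that cost, sketched here as an illustration rather than as what the PR did, is to compile the regular expression once and reuse it across calls. The pattern is the one from the diff above; `ParsedDevice`, `deviceNameRegex`, and `parseDeviceName` are hypothetical names introduced only for this example, and `kind` is kept as a plain string instead of the real `Device` type.

```swift
import Foundation

// Hypothetical stand-in for the (kind, ordinal) pair a device carries.
struct ParsedDevice: Equatable {
  var kind: String
  var ordinal: Int
}

// Compiled a single time and shared by every call, instead of being
// rebuilt on each lookup. The pattern is copied from the diff.
let deviceNameRegex = try! NSRegularExpression(pattern: ".+device:(.+):(\\d+)$")

// Parses names such as "/job:localhost/replica:0/task:0/device:CPU:0".
// Returns nil when the name does not match the expected syntax.
func parseDeviceName(_ name: String) -> ParsedDevice? {
  let fullRange = NSRange(name.startIndex..., in: name)
  guard
    let match = deviceNameRegex.firstMatch(in: name, options: [], range: fullRange),
    let kindRange = Range(match.range(at: 1), in: name),
    let ordinalRange = Range(match.range(at: 2), in: name),
    let ordinal = Int(name[ordinalRange])
  else { return nil }
  return ParsedDevice(kind: String(name[kindRange]), ordinal: ordinal)
}
```

This only removes the repeated compilation; each call still runs a regex match over the device string, so it would need measuring before concluding that it addresses the concern.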

texasmichelle changed the title from "Accurate Device.defaultTFEager" to "Accurate Tensor.device for TFEager backends" on Sep 11, 2020

// Parse type and ordinal from a string with the expected syntax:
// /job:localhost/replica:0/task:0/device:CPU:0
let pattern = ".+device:(.+):(\\d+)$"
Contributor

Maybe break this String -> Device conversion out as a separate function?
Also, I'm concerned that the string parsing will be expensive. This function is called a lot (whenever there is a scalar constant or anything like that). I think it would be best to see if you can add your own TFE_TensorHandleDevice_Type and TFE_TensorHandleDevice_Id. Some benchmarking results might work instead.
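As a rough illustration of the first suggestion, the conversion could live in its own function that avoids regular expressions entirely, with a small cache in front of it so that repeated device names are parsed only once. This is a sketch under assumptions, not the approach the review asks for (which is new C API accessors): `DeviceKind`, `DeviceSpec`, `parseDeviceSpec`, and `DeviceSpecCache` are hypothetical names, the kind list is illustrative, and the cache is not thread-safe.

```swift
// Hypothetical stand-ins for the real Device type; the kinds listed
// here are illustrative only.
enum DeviceKind: String {
  case CPU, GPU, TPU
}

struct DeviceSpec: Equatable {
  var kind: DeviceKind
  var ordinal: Int
}

// Parses "/job:localhost/replica:0/task:0/device:GPU:0" by splitting on
// ":" and reading the last two fields. Returns nil on unexpected input.
func parseDeviceSpec(_ name: String) -> DeviceSpec? {
  let parts = name.split(separator: ":", omittingEmptySubsequences: false)
  guard
    parts.count >= 3,
    parts[parts.count - 3].hasSuffix("/device"),
    let kind = DeviceKind(rawValue: String(parts[parts.count - 2])),
    let ordinal = Int(parts[parts.count - 1]),
    ordinal >= 0
  else { return nil }
  return DeviceSpec(kind: kind, ordinal: ordinal)
}

// Memoizes parsed names. A process usually sees only a handful of
// distinct device strings, so most lookups become dictionary hits.
// Not thread-safe: callers would need their own synchronization.
final class DeviceSpecCache {
  private var cache: [String: DeviceSpec] = [:]
  private(set) var parseCount = 0

  func lookup(_ name: String) -> DeviceSpec? {
    if let hit = cache[name] { return hit }
    parseCount += 1
    guard let spec = parseDeviceSpec(name) else { return nil }
    cache[name] = spec
    return spec
  }
}
```

A cache hit still hashes the device string, so this is cheaper than re-parsing but not free. Accessors that return the type and ordinal directly, as proposed in the comment above, would avoid handling the string at all.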

texasmichelle (Member, Author) commented:

Different approach coming. Closing this PR

Development

Successfully merging this pull request may close these issues.

Eager tensors always report being on CPU Device despite documentation
2 participants